{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "hOFbw09Cle00"
   },
   "source": [
    "<center>\n",
    "    <p style=\"text-align:center\">\n",
    "        <img alt=\"phoenix logo\" src=\"https://storage.googleapis.com/arize-phoenix-assets/assets/phoenix-logo-light.svg\" width=\"200\"/>\n",
    "        <br>\n",
    "        <a href=\"https://arize.com/docs/phoenix/\">Docs</a>\n",
    "        |\n",
    "        <a href=\"https://github.com/Arize-ai/phoenix\">GitHub</a>\n",
    "        |\n",
    "        <a href=\"https://arize-ai.slack.com/join/shared_invite/zt-2w57bhem8-hq24MB6u7yE_ZF_ilOYSBw#/shared-invite/email\">Community</a>\n",
    "    </p>\n",
    "</center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ZEe1jfDIluor"
   },
   "source": [
    "# LangGraph - Prompt Chaining"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "698F5HJ4lmZp"
   },
   "source": [
    "This notebook demonstrates how to use prompt chaining with LangGraph to build a multi-step email assistant. The assistant guides the writing process through three distinct stages:\n",
    "\n",
    "- Generating an outline based on subject and bullet points\n",
    "\n",
    "- Writing the initial draft using the outline and desired tone\n",
    "\n",
    "- Refining tone if needed to match the specified style\n",
    "\n",
    "This approach enables fine-grained control over the content generation process by decomposing the task into logical steps. Each stage in the graph is handled by a separate node, enabling targeted prompting, intermediate outputs, and conditional logic.\n",
    "\n",
    "In addition, the entire workflow is instrumented with Phoenix, which provides OpenTelemetry-powered tracing and debugging. You can inspect each step’s inputs, outputs, and transitions directly in the Phoenix UI to identify bottlenecks or missteps in generation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install langgraph langchain langchain_community langchain-openai \"arize-phoenix==9.0.1\" arize-phoenix-otel openinference-instrumentation-langchain"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "sJ95Ayd3Qh4D"
   },
   "source": [
    "The cells below serve as a template for prompt chaining with LangGraph: an email writer with three steps (writing an outline, writing the email, and refining the tone)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "from getpass import getpass\n",
    "\n",
    "from langgraph.graph import END, START, StateGraph"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "os.environ[\"OPENAI_API_KEY\"] = getpass(\"🔑 Enter your OpenAI API key: \")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "UQQuGd5Hl10b"
   },
   "source": [
    "# Configure Phoenix Tracing\n",
    "\n",
    "Go to https://app.phoenix.arize.com/ and generate an API key. The next cell asks for that key and for your Phoenix collector endpoint, which together let Phoenix trace your LangGraph application."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if \"PHOENIX_API_KEY\" not in os.environ:\n",
    "    os.environ[\"PHOENIX_API_KEY\"] = getpass(\"🔑 Enter your Phoenix API key: \")\n",
    "\n",
    "if \"PHOENIX_COLLECTOR_ENDPOINT\" not in os.environ:\n",
    "    os.environ[\"PHOENIX_COLLECTOR_ENDPOINT\"] = input(\"Enter your Phoenix Collector Endpoint: \")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from phoenix.otel import register\n",
    "\n",
    "tracer_provider = register(project_name=\"Prompt Chaining\", auto_instrument=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "-GX_aNJPl_-q"
   },
   "source": [
    "# LLM of choice"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
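    "from typing_extensions import Literal, TypedDict\n",
    "\n",
    "try:\n",
    "    from langchain_openai import ChatOpenAI  # current import path\n",
    "except ImportError:\n",
    "    from langchain.chat_models import ChatOpenAI  # deprecated fallback\n",
    "\n",
    "llm = ChatOpenAI(model=\"gpt-4o-mini\", temperature=0.3)"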
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "O0TJ7x9LmCW4"
   },
   "source": [
    "# Defining Graph State\n",
    "\n",
    "The *EmailState* defines the shared memory for our email-writing agent. Each field represents the evolving state of the email — starting from the user’s initial inputs (subject, notes, tone) and gradually building up through the stages of outline generation, drafting, and final tone refinement. This state dictionary is passed between nodes to ensure context is maintained and updated incrementally throughout the workflow."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class EmailState(TypedDict, total=False):\n",
    "    subject: str\n",
    "    bullet_points: str  # raw user notes\n",
    "    desired_tone: str  # \"formal\", \"friendly\", etc.\n",
    "    outline: str  # result of node 1\n",
    "    draft_email: str  # result of node 2\n",
    "    final_email: str  # after tone reformer (if needed)"
   ]
  },
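  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of how the state builds up, the plain-Python sketch below mimics the merge that happens after each node: a node returns only the keys it produced, and those keys are added to (or overwrite) the shared state. `SketchState`, `apply_update`, and the sample values are hypothetical and not part of LangGraph's API; the sketch assumes the default behavior for state keys that have no custom reducer.\n",
    "\n",
    "```python\n",
    "from typing import TypedDict\n",
    "\n",
    "\n",
    "class SketchState(TypedDict, total=False):\n",
    "    subject: str\n",
    "    outline: str\n",
    "    draft_email: str\n",
    "\n",
    "\n",
    "def apply_update(state: SketchState, update: SketchState) -> SketchState:\n",
    "    # Hypothetical helper: merge a node's partial output into the state.\n",
    "    return {**state, **update}\n",
    "\n",
    "\n",
    "state: SketchState = {\"subject\": \"Quarterly Sales Recap\"}\n",
    "state = apply_update(state, {\"outline\": \"1. Revenue; 2. Pricing\"})\n",
    "state = apply_update(state, {\"draft_email\": \"Hi team, ...\"})\n",
    "\n",
    "print(sorted(state))  # ['draft_email', 'outline', 'subject']\n",
    "```\n",
    "\n",
    "Because `total=False` makes every key optional, a node can return a dictionary with a single key and still satisfy the state type."
   ]
  },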
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "yrPyNT5XmO7u"
   },
   "source": [
    "# Step-by-Step Prompt Chain: Outline → Draft → Tone Check\n",
    "\n",
    "This workflow chains multiple LLM calls to transform raw notes into a polished email:\n",
    "\n",
    "**generate_outline**: Converts user bullet points into a structured outline.\n",
    "\n",
    "**write_email**: Expands the outline into a complete email draft using the desired tone.\n",
    "\n",
    "**tone_gate**: Checks if the draft meets the requested tone using a lightweight LLM classification.\n",
    "\n",
    "**reform_tone**: If the tone doesn't match, this node rewrites the draft while preserving the content.\n",
    "\n",
    "Each node is modular, which allows targeted debugging and reuse across different tasks or formats. This multi-step refinement mirrors how people draft (outline, write, then revise) and can produce higher-quality output than a single prompt."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def generate_outline(state: EmailState) -> EmailState:\n",
    "    \"\"\"LLM call 1 – produce an outline from bullet points.\"\"\"\n",
    "    prompt = (\n",
    "        \"Create a concise outline for an email.\\n\"\n",
    "        f\"Subject: {state['subject']}\\n\"\n",
    "        f\"Bullet points:\\n{state['bullet_points']}\\n\"\n",
    "        \"Return the outline as numbered points.\"\n",
    "    )\n",
    "    outline = llm.invoke(prompt).content\n",
    "    return {\"outline\": outline}\n",
    "\n",
    "\n",
    "def write_email(state: EmailState) -> EmailState:\n",
    "    \"\"\"LLM call 2 – write the email from the outline.\"\"\"\n",
    "    prompt = (\n",
    "        f\"Write a complete email using this outline:\\n{state['outline']}\\n\\n\"\n",
    "        f\"Tone: {state['desired_tone']}\\n\"\n",
    "        \"Start with a greeting, respect professional formatting, and keep it concise.\"\n",
    "    )\n",
    "    email = llm.invoke(prompt).content\n",
    "    return {\"draft_email\": email}\n",
    "\n",
    "\n",
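    "def tone_gate(state: EmailState) -> Literal[\"Pass\", \"Fail\"]:\n",
    "    \"\"\"\n",
    "    Gate: ask the LLM whether the draft already has the desired tone.\n",
    "      Pass  -> the draft matches, so the graph ends.\n",
    "      Fail  -> otherwise (route to reform_tone for a rewrite).\n",
    "    \"\"\"\n",
    "    prompt = (\n",
    "        f\"Check whether the following email matches the desired tone {state['desired_tone']}:\\n\\n\"\n",
    "        f\"{state['draft_email']}\\n\"\n",
    "        \"If it does, return 'Pass'. Otherwise, return 'Fail'. Return only that one word.\"\n",
    "    )\n",
    "    # Normalize the reply so quotes, a trailing period, or different casing\n",
    "    # still map to one of the two keys used by the conditional edge.\n",
    "    verdict = llm.invoke(prompt).content.strip().strip(\"'.\").lower()\n",
    "    return \"Pass\" if verdict == \"pass\" else \"Fail\"\n",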
    "\n",
    "\n",
    "def reform_tone(state: EmailState) -> EmailState:\n",
    "    \"\"\"LLM call 3 – rewrite the email to fit the desired tone.\"\"\"\n",
    "    prompt = (\n",
    "        f\"Reform the following email so it has a {state['desired_tone']} tone.\\n\\n\"\n",
    "        f\"EMAIL:\\n{state['draft_email']}\\n\\n\"\n",
    "        \"Keep content the same but adjust phrasing, word choice, and sign‑off.\"\n",
    "    )\n",
    "    final_email = llm.invoke(prompt).content\n",
    "    return {\"final_email\": final_email}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "RW-eSAYMma6R"
   },
   "source": [
    "# Compiling the Email Prompt Chain with LangGraph\n",
    "\n",
    "Here we assemble the graph that represents our email generation pipeline. It starts at `outline_generator`, moves to `email_writer`, and routes to `tone_reformer` only if the tone check fails; otherwise it ends. This shows the prompt chaining pattern with dynamic control flow that adapts to the model’s output. Once compiled, the graph can be invoked on user input and traced in Phoenix for debugging or optimization."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "graph = StateGraph(EmailState)\n",
    "\n",
    "graph.add_node(\"outline_generator\", generate_outline)\n",
    "graph.add_node(\"email_writer\", write_email)\n",
    "graph.add_node(\"tone_reformer\", reform_tone)\n",
    "\n",
    "# edges\n",
    "graph.add_edge(START, \"outline_generator\")\n",
    "graph.add_edge(\"outline_generator\", \"email_writer\")\n",
    "graph.add_conditional_edges(\n",
    "    \"email_writer\",\n",
    "    tone_gate,\n",
    "    {\"Pass\": END, \"Fail\": \"tone_reformer\"},\n",
    ")\n",
    "graph.add_edge(\"tone_reformer\", END)\n",
    "\n",
    "email_chain = graph.compile()\n",
    "\n",
    "# Optional: visualize the graph (uncomment both lines to render it in a notebook).\n",
    "# from IPython.display import Image, display\n",
    "# display(Image(email_chain.get_graph().draw_mermaid_png()))"
   ]
  },
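  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the routing easier to reason about, here is a plain-Python sketch of the control flow the graph encodes. It calls no model and does not use LangGraph: `run_chain`, the string-building stand-ins for the three nodes, and the `gate` argument are all hypothetical, and the `routes` dictionary only imitates the mapping passed to `add_conditional_edges`.\n",
    "\n",
    "```python\n",
    "from typing import Callable, Dict\n",
    "\n",
    "\n",
    "def run_chain(state: dict, gate: Callable[[dict], str]) -> dict:\n",
    "    # Hypothetical stand-ins for the outline and draft nodes; no model is called.\n",
    "    state = {**state, \"outline\": \"outline for \" + state[\"subject\"]}\n",
    "    state = {**state, \"draft_email\": \"draft from \" + state[\"outline\"]}\n",
    "\n",
    "    # The gate's return value selects the next step.\n",
    "    routes: Dict[str, str] = {\"Pass\": \"END\", \"Fail\": \"tone_reformer\"}\n",
    "    if routes[gate(state)] == \"tone_reformer\":\n",
    "        state = {**state, \"final_email\": \"rewritten \" + state[\"draft_email\"]}\n",
    "    return state\n",
    "\n",
    "\n",
    "passed = run_chain({\"subject\": \"Recap\"}, gate=lambda s: \"Pass\")\n",
    "failed = run_chain({\"subject\": \"Recap\"}, gate=lambda s: \"Fail\")\n",
    "\n",
    "print(\"final_email\" in passed)  # False\n",
    "print(\"final_email\" in failed)  # True\n",
    "```\n",
    "\n",
    "The sketch also shows why downstream code has to fall back to `draft_email`: when the gate passes, `final_email` is never written."
   ]
  },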
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "X5R3Ywnkmmr4"
   },
   "source": [
    "# Example Usage"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "final_state = email_chain.invoke(\n",
    "    {\n",
    "        \"subject\": \"Quarterly Sales Recap & Next Steps\",\n",
    "        \"bullet_points\": \"- Q1 revenue up 18%\\n- Need feedback on new pricing tiers\\n- Reminder: submit pipeline forecasts by Friday\",\n",
    "        \"desired_tone\": \"friendly\",\n",
    "    }\n",
    ")\n",
    "\n",
    "# final_email only exists if the tone gate failed; otherwise use the draft.\n",
    "print(\"\\n========== EMAIL ==========\")\n",
    "print(final_state.get(\"final_email\", final_state[\"draft_email\"]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "9PPjScEzmphR"
   },
   "source": [
    "# Make sure to view your traces in Phoenix!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ZUOKeaDjb3o8"
   },
   "source": [
    "# Let's add some Evaluations (Evals)\n",
    "\n",
    "In this section we evaluate how well each step of the chain performed its task.\n",
    "\n",
    "- Outline Generation: Evaluating clarity/structure and relevance.\n",
    "- Email Writing: Evaluating grammar/spelling and content coherence.\n",
    "- Tone: Evaluating whether the final email (the draft if the tone gate passed, otherwise the rewritten version) matches the desired tone."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "OUTLINE_EVAL_TEMPLATE = \"\"\"\n",
    "You are evaluating an outline created to help write an email.\n",
    "\n",
    "Email Subject: {subject}\n",
    "Bullet Points: {bullet_points}\n",
    "\n",
    "Generated Outline:\n",
    "{outline}\n",
    "\n",
    "Please assess the outline based on the following criteria:\n",
    "1. Clarity & Structure – Is it logically organized and easy to follow?\n",
    "2. Relevance – Does it fully reflect the bullet points provided?\n",
    "\n",
    "Return one of:\n",
    "- 0/2 if both are poor\n",
    "- 1/2 if only one is good\n",
    "- 2/2 if both are good\n",
    "\"\"\"\n",
    "\n",
    "EMAIL_EVAL_TEMPLATE = \"\"\"\n",
    "You are evaluating the quality of an email.\n",
    "\n",
    "Outline Used:\n",
    "{outline}\n",
    "\n",
    "Drafted Email:\n",
    "{final_email}\n",
    "\n",
    "Evaluate the email on:\n",
    "1. Grammar and Spelling – Are there noticeable errors?\n",
    "2. Content Coherence – Does it follow the structure and meaning of the outline?\n",
    "\n",
    "Return one of:\n",
    "- 0/2 if both are poor\n",
    "- 1/2 if only one is good\n",
    "- 2/2 if both are good\n",
    "\"\"\"\n",
    "\n",
    "TONE_REFORM_EVAL_TEMPLATE = \"\"\"\n",
    "You are evaluating whether the tone of the final email matches the target tone.\n",
    "\n",
    "Desired Tone: {desired_tone}\n",
    "\n",
    "Final Email:\n",
    "{final_email}\n",
    "\n",
    "Does the final email match the desired tone?\n",
    "\n",
    "Return \"yes\" or \"no\"\n",
    "\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "zlPQnY4kl5q1"
   },
   "source": [
    "# Pull Spans from Phoenix"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import phoenix as px\n",
    "\n",
    "df = px.Client().get_spans_dataframe(\"name == 'LangGraph'\", project_name=\"Prompt Chaining\")\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.to_csv(\"prompt_chaining_spans.csv\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "V4hAmm_Cl8Vd"
   },
   "source": [
    "# Custom setup for email evaluation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "import pandas as pd\n",
    "\n",
    "from phoenix.evals import OpenAIModel, llm_classify\n",
    "\n",
    "\n",
    "def unpack(row):\n",
    "    blob = json.loads(row[\"attributes.output.value\"])\n",
    "    return pd.Series(\n",
    "        {\n",
    "            \"subject\": blob.get(\"subject\"),\n",
    "            \"bullet_points\": blob.get(\"bullet_points\"),\n",
    "            \"outline\": blob.get(\"outline\"),\n",
    "            \"desired_tone\": blob.get(\"desired_tone\"),\n",
    "            \"draft_email\": blob.get(\"draft_email\"),\n",
    "            \"final_email_raw\": blob.get(\"final_email\"),  # may be None / \"\"\n",
    "        }\n",
    "    )\n",
    "\n",
    "\n",
    "df = df.join(df.apply(unpack, axis=1))\n",
    "\n",
    "\n",
    "def pick_email(row):\n",
    "    fe = row[\"final_email_raw\"]\n",
    "    if fe and fe.strip():\n",
    "        return fe\n",
    "    return row[\"draft_email\"]\n",
    "\n",
    "\n",
    "df[\"final_email\"] = df.apply(pick_email, axis=1)\n",
    "df"
   ]
  },
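  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `unpack` helper above assumes every span's `attributes.output.value` holds a JSON object. If some spans have an empty or non-JSON output, a defensive parser along these lines may help. `safe_load_state` is a hypothetical name, and returning an empty dictionary for unparseable values is an assumption about the behavior you want.\n",
    "\n",
    "```python\n",
    "import json\n",
    "\n",
    "\n",
    "def safe_load_state(value) -> dict:\n",
    "    # Hypothetical helper: return the decoded state, or {} if the value\n",
    "    # is missing, not valid JSON, or not a JSON object.\n",
    "    if not isinstance(value, str) or not value.strip():\n",
    "        return {}\n",
    "    try:\n",
    "        blob = json.loads(value)\n",
    "    except json.JSONDecodeError:\n",
    "        return {}\n",
    "    return blob if isinstance(blob, dict) else {}\n",
    "\n",
    "\n",
    "print(safe_load_state('{\"outline\": \"1. Intro\"}'))  # {'outline': '1. Intro'}\n",
    "print(safe_load_state(None))  # {}\n",
    "print(safe_load_state(\"not json\"))  # {}\n",
    "```\n",
    "\n",
    "Inside `unpack`, `json.loads(row[\"attributes.output.value\"])` could then be swapped for `safe_load_state(row[\"attributes.output.value\"])`, so rows without usable output become empty values that the later `dropna` calls filter out."
   ]
  },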
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "vY3zjUqdmGtt"
   },
   "source": [
    "# Generate Evals"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = OpenAIModel(model=\"gpt-4o\")\n",
    "\n",
    "outline_results = llm_classify(\n",
    "    dataframe=df.dropna(subset=[\"outline\"]),\n",
    "    template=OUTLINE_EVAL_TEMPLATE,\n",
    "    rails=[\"0/2\", \"1/2\", \"2/2\"],\n",
    "    model=model,\n",
    "    provide_explanation=True,\n",
    "    include_prompt=True,\n",
    ")\n",
    "\n",
    "email_results = llm_classify(\n",
    "    dataframe=df.dropna(subset=[\"final_email\"]),\n",
    "    template=EMAIL_EVAL_TEMPLATE,\n",
    "    rails=[\"0/2\", \"1/2\", \"2/2\"],\n",
    "    model=model,\n",
    "    provide_explanation=True,\n",
    "    include_prompt=True,\n",
    ")\n",
    "\n",
    "tone_results = llm_classify(\n",
    "    dataframe=df.dropna(subset=[\"final_email\"]),\n",
    "    template=TONE_REFORM_EVAL_TEMPLATE,\n",
    "    rails=[\"yes\", \"no\"],\n",
    "    model=model,\n",
    "    provide_explanation=True,\n",
    "    include_prompt=True,\n",
    ")"
   ]
  },
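  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `0/2`, `1/2`, and `2/2` rails are strings, so averaging them requires a numeric conversion first. A minimal sketch, assuming you want a score between 0 and 1: `label_to_score` and `RAIL_SCORES` are hypothetical names, and mapping any label outside the rails to `None` is an assumption about how such rows should be treated.\n",
    "\n",
    "```python\n",
    "from typing import Optional\n",
    "\n",
    "RAIL_SCORES = {\"0/2\": 0.0, \"1/2\": 0.5, \"2/2\": 1.0}\n",
    "\n",
    "\n",
    "def label_to_score(label: Optional[str]) -> Optional[float]:\n",
    "    # Hypothetical helper: map a rail label to a score in [0, 1];\n",
    "    # anything outside the rails yields None.\n",
    "    if label is None:\n",
    "        return None\n",
    "    return RAIL_SCORES.get(label.strip())\n",
    "\n",
    "\n",
    "labels = [\"2/2\", \"1/2\", \"unparsable\", \"0/2\"]\n",
    "scores = [s for s in map(label_to_score, labels) if s is not None]\n",
    "print(sum(scores) / len(scores))  # 0.5\n",
    "```\n",
    "\n",
    "Applied to the results above, something like `outline_results[\"score\"] = outline_results[\"label\"].map(label_to_score)` would add a numeric column next to the label."
   ]
  },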
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "NQZUpW_VmIS1"
   },
   "source": [
    "# Export Evals to Phoenix!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from phoenix.client import AsyncClient\n",
    "\n",
    "# Drop llm_classify bookkeeping columns before logging the results as annotations.\n",
    "bookkeeping_columns = [\"prompt\", \"exceptions\", \"execution_status\", \"execution_seconds\"]\n",
    "for results in (outline_results, email_results, tone_results):\n",
    "    results.drop(columns=bookkeeping_columns, inplace=True)\n",
    "\n",
    "px_client = AsyncClient()\n",
    "await px_client.spans.log_span_annotations_dataframe(\n",
    "    dataframe=outline_results,\n",
    "    annotation_name=\"Outline Evaluation\",\n",
    "    annotator_kind=\"LLM\",\n",
    ")\n",
    "await px_client.spans.log_span_annotations_dataframe(\n",
    "    dataframe=email_results,\n",
    "    annotation_name=\"Email Quality Evaluation\",\n",
    "    annotator_kind=\"LLM\",\n",
    ")\n",
    "await px_client.spans.log_span_annotations_dataframe(\n",
    "    dataframe=tone_results,\n",
    "    annotation_name=\"Tone Evaluation\",\n",
    "    annotator_kind=\"LLM\",\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
